The following source coding problem was introduced by Birk and Kol: a sender holds a word
$x \in \{0,1\}^n$, and wishes to broadcast a codeword to $n$ receivers, $R_1, \ldots, R_n$. The receiver
$R_i$ is interested in $x_i$, and has prior side information comprising some subset of the $n$ bits.
This corresponds to a directed graph $G$ on $n$ vertices, where $ij$ is an edge iff $R_i$ knows the bit
$x_j$. An index code for $G$ is an encoding scheme which enables each $R_i$ to always reconstruct $x_i$,
given his side information. The minimal word length of an index code was studied by Bar-Yossef,
Birk, Jayram and Kol [4]. They introduced a graph parameter, $\mathrm{minrk}_2(G)$, which completely
characterizes the length of an optimal linear index code for $G$. The authors of [4] showed that
in various cases linear codes attain the optimal word length, and conjectured that linear index
coding is in fact always optimal.

In this work, we disprove the main conjecture of [4] in the following strong sense: for any
$\varepsilon > 0$ and sufficiently large $n$, there is an $n$-vertex graph $G$ so that every linear index code for
$G$ requires codewords of length at least $n^{1-\varepsilon}$, and yet a non-linear index code for $G$ has a word
length of $n^{\varepsilon}$. This is achieved by an explicit construction, which extends Alon's variant of the
celebrated Ramsey construction of Frankl and Wilson.

In addition, we study optimal index codes in various, less restricted, natural models, and
prove several related properties of the graph parameter $\mathrm{minrk}(G)$.

Source coding deals with a scenario in which a sender has some data string $x$ he wishes to transmit
through a broadcast channel to receivers. The first and classical result in this area is Shannon's 
Source Coding Theorem. This has been followed by various scenarios which differ in the nature 
of the data to be transmitted, the broadcast channel and some assumptions on the computational 
abilities of the users. Another family of source coding problems, which attracted a considerable 
amount of attention over the years, deals with the assumption that the receivers possess some prior 
knowledge on the data string x. It was shown that in some cases even some restricted assumptions 
on this knowledge may drastically affect the nature of the coding problem. 

In this paper we consider a variant of source coding which was first proposed by Birk and Kol 
[5]. In this variant, called Informed Source Coding On Demand (ISCOD), each receiver has some
prior side information, comprising some subset of the input word $x$. The sender is aware of the
portion of $x$ known to each receiver. Moreover, each receiver is interested in just part of the data.
Following [4], we restrict ourselves to the problem which is formalized as follows.

Definition 1 (index code). A sender wishes to send a word $x \in \{0,1\}^n$ to $n$ receivers $R_1, \ldots, R_n$.
Each $R_i$ knows some of the bits of $x$ and is interested solely in the bit $x_i$. An index code of length
$\ell$ for this setting is a binary code of word-length $\ell$, which enables $R_i$ to recover $x_i$ for any $x$ and $i$.

Using a graph model for the side-information, this problem can be restated as a graph parameter. 
For a directed graph $G$ and a vertex $v$, let $N_G^+(v)$ be the set of out-neighbors of $v$ in $G$, and for
$x \in \{0,1\}^n$ and $S \subset [n] = \{1, \ldots, n\}$, let $x|_S$ be the restriction of $x$ to the coordinates of $S$.

Definition 2 ($\ell(G)$). The setting of Definition 1 is characterized by the directed side information
graph $G$ on the vertex set $[n]$, where $(i,j)$ is an edge iff $R_i$ knows the value of $x_j$. An index code of
length $\ell$ for $G$ is a function $E : \{0,1\}^n \to \{0,1\}^{\ell}$ and functions $D_1, \ldots, D_n$, so that for all $i \in [n]$
and $x \in \{0,1\}^n$, $D_i(E(x), x|_{N_G^+(i)}) = x_i$. Denote the minimal length of an index code for $G$ by $\ell(G)$.

Example: Suppose that every receiver $R_i$ knows in advance the whole word $x$, except for the
single bit $x_i$ he wishes to recover. The corresponding side information graph $G$ is the complete
graph $K_n$ (that is, $(i,j)$ is an edge for all $i \ne j$). By broadcasting the XOR of all the bits of $x$,
each receiver can easily compute its missing bit:

$$E(x) = \bigoplus_{i=1}^{n} x_i ,$$
$$D_i\big(E(x), x|_{\{j : j \ne i\}}\big) = E(x) \oplus \Big(\bigoplus_{j \ne i} x_j\Big) = x_i .$$

In this case the code has length $\ell = 1$ and $E$ is a linear function of $x$ over $GF(2)$.
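The XOR scheme of this example can be checked exhaustively for a small $n$. The following is a minimal sketch; the function names `encode` and `decode` and the dictionary representation of the side information are illustrative choices, not notation from the paper.

```python
from functools import reduce
from itertools import product
from operator import xor

def encode(x):
    """Broadcast a single bit: the XOR of all bits of x."""
    return reduce(xor, x, 0)

def decode(i, codeword, side_info):
    """Receiver i knows every bit except x[i]; side_info maps j -> x[j] for j != i."""
    return codeword ^ reduce(xor, side_info.values(), 0)

# Exhaustive check over all words of length n = 4 (complete graph K_4).
n = 4
for x in product((0, 1), repeat=n):
    c = encode(x)
    for i in range(n):
        known = {j: x[j] for j in range(n) if j != i}
        assert decode(i, c, known) == x[i]
```

The receiver index `i` is not needed by `decode` itself here, since the side information already excludes $x_i$; it is kept only to mirror the notation $D_i$.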

The problem of Informed Source Coding On Demand (ISCOD) was presented by Birk and Kol 
[5]. They were motivated by various applications of distributed communication such as satellite 
communication networks with caching clients. In such applications, the clients have limited storage 
and maintain part of the transmitted information. Subsequently, the clients receive requests for 
arbitrary information blocks, and may use a slow backward channel to advise the server of their 
status. The server, playing the role of the sender in Definition 1, then broadcasts a single
transmission to all clients (the receivers). As observed by Birk and Kol, when the sender has only partial
knowledge of the side information (e.g., the number of missing blocks for each user), an erasure 
correcting code such as Reed-Solomon Code performs well. This is also the case if every user is 
expected to be able to decode the whole information. The authors of [5] present some bounds 
and heuristics for obtaining efficient encoding schemes, as well as protocols for implementing the 
above scenario. See [5] and [4] for more details on the relation between the source coding problem, 
as formulated above, and the ISCOD problem, as well as the communication complexity of the 
indexing function, random access codes and network coding. 

Bar-Yossef, Birk, Jayram and Kol [4] further investigated index coding. They showed that this 
problem is different in nature from the well-known source coding problems previously studied by 
Witsenhausen in [12]. Their main contribution is an upper bound on $\ell(G)$, the optimal length of an
index code (Definition 2). The upper bound is a graph parameter denoted by $\mathrm{minrk}_2(G)$, which is
also shown to be the length of the optimal linear index code. It is shown in [4] that in several cases
linear codes are in fact optimal, e.g., for directed acyclic graphs, perfect graphs, odd cycles and
odd anti-holes. An information theoretic lower bound on $\ell(G)$ is obtained: it is at least the size of
a maximal acyclic induced subgraph of G. This lower bound holds even for the relaxed problem of 
randomized index codes, where the sender is allowed to use (public) random coins during encoding, 
and the receivers are expected to decode their information correctly with high probability over 
these coin flips. Nevertheless, they show that in some cases the lower bound is not tight. 

Having proved that the upper bound $\ell(G) \le \mathrm{minrk}_2(G)$ is tight for several natural graph
families and under some relaxed restrictions on the code ("semi-linearly-decodable"), the authors
of [4] conjectured that the length of the optimal index code is in fact equal to $\mathrm{minrk}_2(G)$. That is,
they conjectured that linear index coding is always optimal, and concluded that this was the main
open problem to be investigated.

Before stating the main results of this paper, we review the definition of minrk2(G) and other 
related graph theoretic parameters. 

1.1 Definitions, notations and background 

Let $G = (V, E)$ be a directed graph on the vertex set $V = [n]$. The adjacency matrix of $G$, denoted
by $A_G = (a_{ij})$, is the $n \times n$ binary matrix where $a_{ij} = 1$ iff $(i,j) \in E$. An independent set of $G$ is
a set of vertices which have no edges between them, and the independence number of $G$, $\alpha(G)$, is
the cardinality of a maximum independent set. The chromatic number of $G$, $\chi(G)$, is the minimum
number of independent sets whose union is all of $V$. Let $\overline{G}$ denote the graph complement of $G$.
A clique of $G$ is an independent set of $\overline{G}$ (i.e., a set of vertices such that all edges between them
belong to $G$), and the clique number of $G$, $\omega(G)$, is the cardinality of a maximum clique. Without
being formal, a graph $G$ is called "Ramsey" if both $\alpha(G)$ and $\omega(G)$ are "small".

In [4], a binary $n \times n$ matrix $A = (a_{ij})$ was said to "fit" $G$ if $A$ has 1-s on its diagonal, and 0
in all the indices $i,j$ where $i \ne j$ and $(i,j) \notin E$. The parameter $\mathrm{minrk}_2(G)$ was defined to be the
minimal possible rank over $GF(2)$ of a matrix which fits $G$.

To extend this definition to a general field, let $A = (a_{ij})$ be an $n \times n$ matrix over some field $\mathbb{F}$.
We say that $A$ represents the graph $G$ over $\mathbb{F}$ if $a_{ii} \ne 0$ for all $i$, and $a_{ij} = 0$ whenever $i \ne j$ and
$(i,j) \notin E$. The minrank of a directed graph $G$ with respect to the field $\mathbb{F}$ is defined by

$$\mathrm{minrk}_{\mathbb{F}}(G) = \min\{\mathrm{rank}_{\mathbb{F}}(A) : A \text{ represents } G \text{ over } \mathbb{F}\} .$$

For the common case where $\mathbb{F}$ is a finite field, we abbreviate:

$$\mathrm{minrk}_{p^k}(G) = \mathrm{minrk}_{GF(p^k)}(G) .$$
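For very small graphs, $\mathrm{minrk}_2(G)$ can be computed directly from the definition by enumerating all binary matrices that represent $G$. The sketch below is a brute-force illustration only (exponential in the number of edges); the helper names `rank_gf2`, `minrk2` and `undirected` are hypothetical, and edges are assumed to be given as ordered pairs $(i,j)$ with $i \ne j$ on the vertex set $\{0, \ldots, n-1\}$.

```python
from itertools import product

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def minrk2(n, edges):
    """Minimum GF(2)-rank over matrices with a_ii = 1 and a_ij = 0 for non-edges."""
    free = sorted(edges)  # entries (i, j) that may be either 0 or 1
    best = n
    for bits in product((0, 1), repeat=len(free)):
        rows = [1 << i for i in range(n)]  # diagonal entries are 1
        for (i, j), b in zip(free, bits):
            if b:
                rows[i] |= 1 << j
        best = min(best, rank_gf2(rows))
    return best

def undirected(pairs):
    """Turn undirected edges into both directed orientations."""
    return {(i, j) for i, j in pairs} | {(j, i) for i, j in pairs}

c5 = undirected([(i, (i + 1) % 5) for i in range(5)])
k4 = {(i, j) for i in range(4) for j in range(4) if i != j}
```

For instance, `minrk2(4, k4)` is 1 (the all-ones matrix), while an edgeless graph on $n$ vertices has minrank $n$, since only the diagonal may be nonzero.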

The notion of $\mathrm{minrk}(G)$ for an undirected graph $G$ was first considered in the context of graph
capacities by Haemers [8], [9]. The Shannon capacity of the graph $G$, denoted by $c(G)$, is a
notoriously challenging parameter, which was defined by Shannon in [11], and remains unknown even
for simple graphs, such as $C_7$, the cycle on 7 vertices. Lower bounds for $c(G)$ are given in terms of
independence numbers of certain graphs, and in particular, $\alpha(G) \le c(G)$. Haemers showed that for
all $\mathbb{F}$, $\mathrm{minrk}_{\mathbb{F}}(G)$ is sandwiched between $c(G)$ and $\chi(\overline{G})$, the chromatic number of the complement
of $G$, altogether giving

$$\alpha(G) \le c(G) \le \mathrm{minrk}_{\mathbb{F}}(G) \le \chi(\overline{G}) . \tag{1}$$

While $\mathrm{minrk}_{\mathbb{F}}(G)$ can prove to be difficult to compute, the most useful upper bound for $c(G)$ is
$\vartheta(G)$, the Lovász $\vartheta$-function, which was introduced in the seminal paper [10] to compute $c(C_5)$.
The matrix-rank argument was thereafter introduced by Haemers to answer some questions of [10],
and has since been used (under some variants) in additional settings to obtain better bounds than
those provided by the $\vartheta$-function (cf., e.g., [1]).

1.2 New results 

The main result of this paper is an improved index coding scheme, which is shown to strictly improve 
upon the $\mathrm{minrk}_2(G)$ bound. This disproves the main conjecture of [4] regarding the optimality of
linear index coding, as stated by the following theorem. 

Theorem 1.1. For any $\varepsilon > 0$ and any sufficiently large $n$, there is an $n$-vertex graph $G$ so that:

1. Any linear index code for $G$ requires $n^{1-\varepsilon}$ bits, that is, $\mathrm{minrk}_2(G) \ge n^{1-\varepsilon}$.

2. There exists a non-linear index code for $G$ using $n^{\varepsilon}$ bits, that is, $\ell(G) \le n^{\varepsilon}$.

Moreover, the graph $G$ is undirected and can be constructed explicitly.

Note that this in fact disproves the conjecture of Bar-Yossef et al. in the following strong
sense: the ratio between an optimal code and an optimal linear code over $GF(2)$ can be $n^{1-o(1)}$.
The essence of the proof lies in the fact that, in some cases, linear codes over higher-order fields 
may yield significantly better index coding schemes. The term "linear codes over GF(p)" is used 
here to describe a coding scheme, in which the input word is encoded into a sequence of linear 
functionals of its symbols over GF(p), which are subsequently used for the decoding (the protocol 
for transmitting these functionals need not be linear). However, as the next theorem demonstrates, 
even this extended family of index codes may be suboptimal. 

Theorem 1.2. For any $\varepsilon > 0$ and any sufficiently large $n$, there is an $n$-vertex graph $G$ so that:

1. Any linear index code for $G$ over some field $\mathbb{F}$ requires $\sqrt{n}$ symbols, that is, $\mathrm{minrk}_{\mathbb{F}}(G) \ge \sqrt{n}$.

2. There exists a non-linear index code for $G$ using $n^{\varepsilon}$ bits, that is, $\ell(G) \le n^{\varepsilon}$.

Moreover, the graph $G$ is undirected and can be constructed explicitly.

In order to prove the above two theorems, we introduce the following upper bound on $\ell(G)$,
which is a simple extension of a result of [4] (the special case $\mathbb{F} = GF(2)$), and is a special case of
Proposition 2.1 (Section 2):

$$\ell(G) \le \min_{\mathbb{F} \,:\, |\mathbb{F}| < \infty} \big\lceil \mathrm{minrk}_{\mathbb{F}}(G) \log_2 |\mathbb{F}| \big\rceil . \tag{2}$$

The proof of Theorem 1.1 relies on the fact that for some graphs, the minimum of (2) is attained
when $\mathbb{F} \ne GF(2)$, in which case the linear code over $GF(2)$ is suboptimal. Proposition 2.2 (Section
2) provides a construction of such graphs, and is the main ingredient in the proof of Theorem 1.1.
This proposition, which may be of independent interest, states that for any pair of finite fields with
distinct characteristics, $\mathbb{F}$ and $\mathbb{K}$, the gap between $\mathrm{minrk}_{\mathbb{F}}$ and $\mathrm{minrk}_{\mathbb{K}}$ can be $n^{1-o(1)}$. Theorem
1.1 is then obtained as a corollary of (2) and a special case of Proposition 2.2.

Moreover, as Theorem 1.2 shows, the upper bound of (2) is not always tight. To see this, we
combine the construction in the above mentioned Proposition 2.2 with some additional ideas.

As an additional corollary, Proposition 2.2 yields that $\mathrm{minrk}_{\mathbb{F}}(G)/\vartheta(G)$ (where $\vartheta$ is the Lovász
$\vartheta$-function and $|V(G)| = n$) is in some cases (roughly) at least $\sqrt{n}$, whereas in other cases it is
(roughly) at most $1/\sqrt{n}$. This addresses another question of [4] on the relation between these two
parameters. The relation between $\ell(G)$ and the Shannon capacity of $G$, $c(G)$, is addressed as well,
as a by-product of the proof of Theorem 1.2.

We also extend the main construction of Proposition 2.2 and give, for any prescribed set of
finite fields $\{\mathbb{F}_i\}$ and an additional finite field $\mathbb{K}$ of a distinct characteristic, a construction of a
graph $G$ so that $\mathrm{minrk}_{\mathbb{F}_i}(G)$ is "large" for all $i$, whereas $\mathrm{minrk}_{\mathbb{K}}(G)$ is "small".

Proposition 1.3. For any fixed $t$, let $\mathbb{F}_1, \ldots, \mathbb{F}_t$ denote finite fields, and let $\mathbb{K}$ denote a finite field
of a distinct characteristic. For any $\varepsilon > 0$ and a sufficiently large $n$, there is an explicit construction
of a graph $G$ on $n$ vertices, so that $\mathrm{minrk}_{\mathbb{K}}(G) \le n^{\varepsilon}$, whereas for all $i \in [t]$, $\mathrm{minrk}_{\mathbb{F}_i}(G) \ge n^{1-\varepsilon}$.

In the second part of this paper, we revisit the problem definition. It is shown that the restricted 
problem given in Definition 1 captures many other cases arising from the original distributed
applications, which motivated the study of Informed Source Coding On Demand. In particular,
we suggest appropriate models and reductions for cases in which multiple users are interested in
the same bit, there are multiple rounds of transmission and the transmitted words are over a large
alphabet. These models are obtained as natural extensions of the original problem, and exhibit
interesting relations to the parameters $\ell(G)$ and $\mathrm{minrk}(G)$.

1.3 Techniques 

A key element in the proof of the main result is an extended version of the Ramsey graph constructed 
by Alon [1], which is a variant of the well-known Ramsey construction of Frankl and Wilson [7].
This graph, $G_{p,q}$ for some large primes $p, q$, was used by Alon in order to disprove an old conjecture
of Shannon [11] on the Shannon capacity of a union of graphs. 

Using some properties of the minrk parameter, one can show that the graph $G_{p,q}$ has a "small"
$\mathrm{minrk}_p$ and a "large" $\mathrm{minrk}_q$, implying that the optimal linear index code over $GF(p)$ may be
significantly better than the one over $GF(q)$. However, it is imperative in the above construction
that both $p$ and $q$ be large, whereas we are interested in the case $q = 2$, corresponding to
$\mathrm{minrk}_2$. To this end, we extend the above construction of [1] to prime-powers, using some classical
results on congruences of binomial coefficients. This allows omitting the requirement that $p, q$
should be large, by taking sufficiently large powers of arbitrary distinct primes $p$ and $q$.

Using variants of the above construction, we extend the results to multiple fields, to obtain 
Theorem 1.2 and Proposition 1.3. En route, we derive several properties of the minrank parameter,
which may be of independent interest.

The proofs of the results throughout the paper combine arguments from Linear Algebra and
Number Theory along with some additional ideas, inspired by the theory of graph capacities under
the strong graph product definition.

1.4 Organization

The rest of the paper is organized as follows. Section 2 contains a description of the basic
construction, and the proof of Theorem 1.1. The extension of this result to multiple fields, including the
proof of Theorem 1.2, appears in Section 3. In Section 4 we study the various extensions of the
original problem. Section 5 contains some concluding remarks and open problems.



2 Linear index codes over higher-order fields 

In this section, we prove Theorem 1.1 by constructing graphs for which a given linear index code
over a higher-order field outperforms all linear index codes over $GF(2)$.

2.1 Proof of Theorem 1.1

The first ingredient in the proof is a linear index coding scheme, which is an extension of the ideas 
in [4] for larger fields. This notion is formulated in the next proposition, whose proof appears in
Subsection 2.2.

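Proposition 2.1. Let $G$ be a graph, and let $A$ be a matrix which represents $G$ over some field
$\mathbb{F}$ (not necessarily finite). Then $\ell(G) \le \lceil \log_2 |\{Ax : x \in \{0,1\}^n\}| \rceil$. In particular, the following
holds:

$$\ell(G) \le \min_{\mathbb{F} \,:\, |\mathbb{F}| < \infty} \big\lceil \mathrm{minrk}_{\mathbb{F}}(G) \log_2 |\mathbb{F}| \big\rceil .$$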

The second and main ingredient in the proof of Theorem 1.1 is Proposition 2.2, whose proof
appears in Subsection 2.3. Here and in what follows, all logarithms are in the natural base unless
stated otherwise.

Proposition 2.2. Let $\mathbb{F}$ and $\mathbb{K}$ denote two finite fields with distinct characteristics. There is an
explicit construction of a family of graphs $G = G(n)$ on $n$ vertices, so that

$$\mathrm{minrk}_{\mathbb{F}}(G) \le \exp\Big(\sqrt{(2 + o(1)) \log n \log\log n}\Big) = n^{o(1)} , \tag{3}$$

and yet:

$$\mathrm{minrk}_{\mathbb{K}}(G) \ge n \Big/ \exp\Big(\sqrt{(2 + o(1)) \log n \log\log n}\Big) = n^{1-o(1)} , \tag{4}$$

where the $o(1)$-terms tend to 0 as $n \to \infty$.

In order to derive Theorem 1.1 from Propositions 2.1 and 2.2, apply Proposition 2.2, setting
$\mathbb{F} = GF(p)$ and $\mathbb{K} = GF(2)$, where $p > 2$ is any fixed (odd) prime. Let $\varepsilon > 0$; for any sufficiently
large $n$, the graph obtained above satisfies

$$\mathrm{minrk}_2(G) \ge n \Big/ \exp\Big(O\big(\sqrt{\log n \log\log n}\big)\Big) \ge n^{1-\varepsilon} ,$$

and hence, by the results of [4], any linear index code over $GF(2)$ requires a word length of at least
$n^{1-\varepsilon}$ bits. On the other hand, Proposition 2.1 implies that

$$\ell(G) \le \big\lceil \mathrm{minrk}_p(G) \log_2 p \big\rceil \le \exp\Big(O\big(\sqrt{\log n \log\log n}\big)\Big) \le n^{\varepsilon} .$$



2.2 Proof of Proposition 2.1

Let $V = [n]$ denote the vertex set of $G$, $A = (a_{ij})$ denote a matrix which represents $G$ over some
field $\mathbb{F}$ (not necessarily finite), and $S = \{Ax : x \in \{0,1\}^n\} \subset \mathbb{F}^n$. For some arbitrary ordering
of the elements of $S$, the encoding of $x \in \{0,1\}^n$ is the label of $Ax$, requiring a word-length of
$\lceil \log_2 |S| \rceil$ bits. For decoding, the $i$-th receiver $R_i$ examines $(Ax)_i$, and since the diagonal of $A$ does
not contain zero entries by definition, we have:

$$a_{ii}^{-1} (Ax)_i = x_i + a_{ii}^{-1} \sum_{j \ne i} a_{ij} x_j = x_i + a_{ii}^{-1} \sum_{j \in N_G^+(i)} a_{ij} x_j ,$$

where the last equality is by the fact that $A$ represents $G$. As $R_i$ knows $\{x_j : j \in N_G^+(i)\}$, this
allows $R_i$ to recover $x_i$. Therefore, indeed $\ell(G) \le \lceil \log_2 |S| \rceil$.

To conclude the proof, note that in case $\mathbb{F}$ is finite, we have $|S| \le |\mathbb{F}|^{\mathrm{rank}_{\mathbb{F}}(A)}$, as required.
Furthermore, in this case it is possible to use a linear code utilizing the same word-length. The
sender transmits a binary-encoding of the inner-products $(u_1 \cdot x, \ldots, u_r \cdot x) \in \mathbb{F}^r$, where $\{u_1, \ldots, u_r\}$
is a basis for the rows of $A$ over $\mathbb{F}$. ■
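The encoding and decoding steps of this proof can be sketched for a toy instance. The matrix `A` below is a hypothetical rank-2 matrix over $GF(3)$ chosen only for illustration (it does not come from the paper), and the side information graph is assumed to be exactly the off-diagonal support of `A`; the names `times_A`, `encode` and `decode` are likewise illustrative.

```python
from itertools import product
from math import ceil, log2

P = 3  # illustrative field GF(3)
# Hypothetical matrix with nonzero diagonal; rows lie in the span of
# u = (1,1,1,1,0) and v = (0,1,2,1,1), so the rank over GF(3) is 2.
A = [[1, 1, 1, 1, 0],
     [1, 1, 1, 1, 0],
     [0, 1, 2, 1, 1],
     [1, 2, 0, 2, 1],
     [0, 1, 2, 1, 1]]
N = len(A)
# Out-neighbours of i: off-diagonal positions where row i is nonzero.
OUT = [[j for j in range(N) if j != i and A[i][j] % P] for i in range(N)]

def times_A(x):
    """Compute Ax over GF(P) for a binary word x."""
    return tuple(sum(A[i][j] * x[j] for j in range(N)) % P for i in range(N))

# S = {Ax : x in {0,1}^N}, in an arbitrary (here: sorted) order.
S = sorted({times_A(x) for x in product((0, 1), repeat=N)})
LENGTH = ceil(log2(len(S)))  # word length in bits

def encode(x):
    """The codeword is the label of Ax within S."""
    return S.index(times_A(x))

def decode(i, label, side_info):
    """Recover x_i; side_info maps j -> x_j for the out-neighbours j of i only."""
    y_i = S[label][i]
    rest = sum(A[i][j] * side_info[j] for j in OUT[i])
    inv = pow(A[i][i], P - 2, P)  # inverse of a_ii in GF(P), by Fermat's little theorem
    return (inv * (y_i - rest)) % P
```

In this toy instance the word length is 4 bits for 5 receivers, matching $\lceil 2 \log_2 3 \rceil$; the point of the sketch is the decoding identity, not the size of the saving.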

Remark 2.3: As proved for the case $\mathbb{F} = GF(2)$ in [4], it is possible to show that the above
bound is tight for the case of linear codes over $\mathbb{F}$. That is, the length of an optimal linear index
code over a finite field $\mathbb{F}$ is $\lceil \mathrm{minrk}_{\mathbb{F}}(G) \log_2 |\mathbb{F}| \rceil$.

2.3 Proof of Proposition 2.2

We first consider the case $\mathbb{F} = GF(p)$ and $\mathbb{K} = GF(q)$ for distinct primes $p$ and $q$. Let $\varepsilon > 0$, and
let $k$ denote a (large) integer satisfying^1

$$q^l < p^k < (1 + \varepsilon)\, q^l , \tag{5}$$
$$\text{where } l = \lfloor k \log_q p \rfloor . \tag{6}$$

Define:

$$s = p^k q^l - 1 \quad \text{and} \quad r = p^{3k} . \tag{7}$$

^1 It is easy to verify that there are infinitely many such integers $k$, as $p, q$ are distinct primes, and hence the set
$\{k \log_q p \pmod 1 : k \in \mathbb{N}\}$ is dense in $[0, 1]$.

The graph $G$ on $n = \binom{r}{s}$ vertices^2 is defined as follows. Its vertices are all $s$-element subsets of $[r]$,
and two vertices are adjacent iff their corresponding sets have an intersection whose cardinality is
congruent to $-1$ modulo $p^k$:

$$V(G) = \binom{[r]}{s} , \qquad (X, Y) \in E(G) \iff X \ne Y ,\ |X \cap Y| \equiv -1 \pmod{p^k} . \tag{8}$$



For some integer $d$ to be determined later, define the inclusion matrix $M_d$ to be the $\binom{r}{s} \times \binom{r}{d}$ binary
matrix, indexed by all $s$-element and $d$-element subsets of $[r]$, where $(M_d)_{A,B} = 1$ iff $B \subset A$, for
all $A \in \binom{[r]}{s}$ and $B \in \binom{[r]}{d}$. Notice that the $n \times n$ matrix $M_d (M_d)^T$ satisfies the following for all
$A, B \in V$ (not necessarily distinct):

$$\big(M_d (M_d)^T\big)_{A,B} = \Big|\Big\{ X \in \tbinom{[r]}{d} : X \subset (A \cap B) \Big\}\Big| = \binom{|A \cap B|}{d} . \tag{9}$$
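The identity for the entries of $M_d (M_d)^T$, namely that the $(A,B)$ entry counts the $d$-subsets of $A \cap B$, can be checked numerically on a small ground set. This is a minimal sketch with toy parameters ($r = 6$, $s = 3$, $d = 2$), far smaller than those of the construction; `inclusion_matrix` and `gram` are illustrative helper names.

```python
from itertools import combinations
from math import comb

def inclusion_matrix(r, s, d):
    """Rows: s-subsets of {0..r-1}; columns: d-subsets; entry 1 iff column set is inside row set."""
    rows = [frozenset(c) for c in combinations(range(r), s)]
    cols = [frozenset(c) for c in combinations(range(r), d)]
    return rows, [[int(b <= a) for b in cols] for a in rows]

def gram(m):
    """Compute M * M^T over the integers."""
    return [[sum(x * y for x, y in zip(u, v)) for v in m] for u in m]

rows, m = inclusion_matrix(6, 3, 2)
g = gram(m)

# Each entry equals the binomial coefficient C(|A ∩ B|, d).
assert all(g[i][j] == comb(len(rows[i] & rows[j]), 2)
           for i in range(len(rows)) for j in range(len(rows)))
```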



Define $P = M_{p^k - 1} (M_{p^k - 1})^T$ and $Q = M_{q^l - 1} (M_{q^l - 1})^T$. We claim that $P$ represents $G$ over $GF(p)$
whereas $Q$ represents $\overline{G}$ over $GF(q)$. To see this, we need the following simple observation, which
is a special case of Lucas's Theorem (cf., e.g., [6]) on congruences of binomial coefficients. It
was used, for instance, in [3] for constructing low-degree representations of OR functions modulo
composite numbers, as well as in [7].

Observation 2.4. For every prime $p$ and integers $i, j, e$ with $i < p^e$, $\binom{j + p^e}{i} \equiv \binom{j}{i} \pmod p$.
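The congruence of Observation 2.4, read here as $\binom{j + p^e}{i} \equiv \binom{j}{i} \pmod p$ for $i < p^e$, can be sanity-checked numerically, together with the consequence used in the sequel: modulo $p$, the binomial $\binom{x}{p^e - 1}$ is 1 when $x \equiv -1 \pmod{p^e}$ and 0 otherwise. The sketch below is a finite check for small parameters, not a proof; the function names are hypothetical.

```python
from math import comb

def observation_holds(p, e, max_j):
    """Check C(j + p^e, i) = C(j, i) (mod p) for all i < p^e and 0 <= j <= max_j."""
    q = p ** e
    return all(comb(j + q, i) % p == comb(j, i) % p
               for i in range(q) for j in range(max_j + 1))

def is_minus_one_indicator(p, e, max_x):
    """Check that C(x, p^e - 1) mod p is 1 iff x = -1 (mod p^e), and 0 otherwise."""
    q = p ** e
    return all(comb(x, q - 1) % p == (1 if x % q == q - 1 else 0)
               for x in range(max_x + 1))
```

The second check is the form in which the observation enters the argument: the diagonal entries of $P$ and $Q$ are 1, and the entries at pairs whose intersection size is not $-1$ modulo the relevant prime power vanish.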



Consider some $A \in V$; since $s$ satisfies both $s \equiv (p^k - 1) \pmod{p^k}$ and $s \equiv (q^l - 1) \pmod{q^l}$,
combining (9) with Observation 2.4 gives

$$P_{A,A} = \binom{s}{p^k - 1} \equiv 1 \pmod p ,$$

and

$$Q_{A,A} = \binom{s}{q^l - 1} \equiv 1 \pmod q .$$

Thus, indeed the diagonal entries of $P$ and $Q$ are non-zero; it remains to show that their $(A, B)$-entries
are 0 wherever $A, B$ are distinct non-adjacent vertices. To this end, take $A, B \in V$ so that
$A \ne B$ and $AB \notin E(G)$; by (8), $|A \cap B| \not\equiv -1 \pmod{p^k}$, hence

$$P_{A,B} = \binom{|A \cap B|}{p^k - 1} \equiv 0 \pmod p ,$$

the last equivalence again following from Observation 2.4, as $\binom{x}{p^k - 1} = 0$ for all $x \in \{0, \ldots, p^k - 2\}$.
Finally, suppose that $A, B \in V$ satisfy $A \ne B$ and $AB \notin E(\overline{G})$. That is, $AB \in E(G)$, hence
$|A \cap B| \equiv -1 \pmod{p^k}$. The Chinese Remainder Theorem now implies $|A \cap B| \not\equiv -1 \pmod{q^l}$,
otherwise we would get $|A \cap B| = s$ and $A = B$. Since $\binom{x}{q^l - 1} = 0$ for all $x \in \{0, \ldots, q^l - 2\}$, we get

$$Q_{A,B} = \binom{|A \cap B|}{q^l - 1} \equiv 0 \pmod q .$$

^2 By well known properties of the density of prime numbers, and standard graph theoretic arguments, proving the
assertion of the proposition for these values of $n$ in fact implies the result for any $n$.

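Altogether, $P$ represents $G$ over $GF(p)$, and $Q$ represents $\overline{G}$ over $GF(q)$. Therefore, $\mathrm{minrk}_p(G)$ is
at most $\mathrm{rank}_p(P) \le \mathrm{rank}_p(M_{p^k - 1})$, and similarly, $\mathrm{minrk}_q(\overline{G})$ is at most $\mathrm{rank}_q(Q) \le \mathrm{rank}_q(M_{q^l - 1})$.

As $M_d$ has $\binom{r}{d}$ columns, $n = \binom{r}{s}$ and $q^l < p^k < (1 + \varepsilon) q^l$, a straightforward calculation now gives:

$$\mathrm{minrk}_p(G) \le \binom{r}{p^k - 1} \le \exp\Big(\sqrt{(1 + \varepsilon + o(1))\, 2 \log n \log\log n}\Big) ,$$
$$\mathrm{minrk}_q(\overline{G}) \le \binom{r}{q^l - 1} \le \exp\Big(\sqrt{(1 + \varepsilon + o(1))\, 2 \log n \log\log n}\Big) .$$

The next simple claim relates $\mathrm{minrk}_q(G)$ and $\mathrm{minrk}_q(\overline{G})$:

Claim 2.5. For any graph $G$ on $n$ vertices and any field $\mathbb{F}$, $\mathrm{minrk}_{\mathbb{F}}(G) \cdot \mathrm{minrk}_{\mathbb{F}}(\overline{G}) \ge n$.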

Proof. We use the following definition of graph product due to Shannon [11]: $G_1 \times G_2$, the strong
graph product of $G_1$ and $G_2$, is the graph whose vertex set is $V(G_1) \times V(G_2)$, where two distinct
vertices $(u_1, u_2) \ne (v_1, v_2)$ are adjacent iff for all $i \in \{1, 2\}$, either $u_i = v_i$ or $(u_i, v_i) \in E(G_i)$.

As observed by Haemers [8], if $A_1$ and $A_2$ represent $G_1$ and $G_2$ respectively over $\mathbb{F}$, then the
tensor product $A_1 \otimes A_2$ represents $G_1 \times G_2$ over $\mathbb{F}$. To see this, notice that the diagonal of $A_1 \otimes A_2$
does not contain zero entries, and that if $(u_1, u_2) \ne (v_1, v_2)$ are disconnected vertices in $G_1 \times G_2$,
then by definition $(A_1)_{u_1, v_1} (A_2)_{u_2, v_2} = 0$, since in this case for some $i \in \{1, 2\}$ we have $u_i \ne v_i$
and $u_i v_i \notin E(G_i)$. Letting $A_1$ and $A_2$ denote matrices which attain $\mathrm{minrk}_{\mathbb{F}}(G)$ and $\mathrm{minrk}_{\mathbb{F}}(\overline{G})$
respectively, the above discussion implies that:

$$\mathrm{minrk}_{\mathbb{F}}(G \times \overline{G}) \le \mathrm{rank}(A_1 \otimes A_2) = \mathrm{minrk}_{\mathbb{F}}(G) \cdot \mathrm{minrk}_{\mathbb{F}}(\overline{G}) .$$

However, the set $\{(u, u) : u \in V(G)\}$ is an independent set of $G \times \overline{G}$, since for $u \ne v$, either
$uv \in E(G)$ and $uv \notin E(\overline{G})$ or vice versa. Therefore, (1) gives $\mathrm{minrk}_{\mathbb{F}}(G \times \overline{G}) \ge \alpha(G \times \overline{G}) \ge n$,
completing the proof of the claim. ■
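The independent set used at the end of this proof can be verified mechanically for a small graph. The sketch below builds adjacency in the strong product of a graph with its complement and checks that the diagonal $\{(u,u)\}$ contains no adjacent pair; the function names are illustrative, and graphs are assumed to be given as sets of ordered pairs on the vertex set $\{0, \ldots, n-1\}$.

```python
from itertools import product

def complement(n, edges):
    """Edge set of the complement graph on vertices 0..n-1 (no loops)."""
    return {(u, v) for u in range(n) for v in range(n)
            if u != v and (u, v) not in edges}

def strong_adjacent(a, b, e1, e2):
    """Adjacency of distinct vertices a=(u1,u2), b=(v1,v2) in the strong product G1 x G2."""
    (u1, u2), (v1, v2) = a, b
    return (a != b
            and (u1 == v1 or (u1, v1) in e1)
            and (u2 == v2 or (u2, v2) in e2))

def diagonal_is_independent(n, edges):
    """True iff {(u,u)} has no adjacent pair in G x complement(G)."""
    co = complement(n, edges)
    diag = [(u, u) for u in range(n)]
    return not any(strong_adjacent(a, b, edges, co)
                   for a, b in product(diag, diag))
```

For two distinct diagonal vertices $(u,u)$ and $(v,v)$, adjacency would require $uv$ to be an edge of both the graph and its complement, which is why the check succeeds for every input graph.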

This concludes the proof of the proposition for the case F = GF(p), K = GF(q), where p, q are 
two distinct primes. The generalization to the case of prime-powers is an immediate consequence 
of the next claim: 

Claim 2.6. Let $G$ be a graph, $p$ be a prime and $k$ be an integer. The following holds:

$$\tfrac{1}{k}\, \mathrm{minrk}_p(G) \le \mathrm{minrk}_{p^k}(G) \le \mathrm{minrk}_p(G) . \tag{10}$$

Proof. The statement $\mathrm{minrk}_{p^k}(G) \le \mathrm{minrk}_p(G)$ follows immediately from the fact that any
matrix $A$ which represents $G$ over $GF(p)$ also represents $G$ over $GF(p^k)$, and in addition satisfies
$\mathrm{rank}_{p^k}(A) \le \mathrm{rank}_p(A)$.

To show that $\mathrm{minrk}_p(G) \le k\, \mathrm{minrk}_{p^k}(G)$, let $V = [n]$ denote the vertex set of $G$, and let
$A = (a_{ij})$ denote a matrix which represents $G$ over $GF(p^k)$ with rank $r = \mathrm{minrk}_{p^k}(G)$. As usual,
we represent the elements of $GF(p^k)$ as polynomials of degree at most $k - 1$ over $GF(p)$ in the
variable $x$. Since the result of multiplying each row of $A$ by a non-zero element of $GF(p^k)$ is
a matrix of rank $r$ which also represents $G$ over $GF(p^k)$, assume without loss of generality that
$a_{ii} = 1$ for all $i \in [n]$. By this assumption, the $n \times n$ matrix $B = (b_{ij})$, which contains the free
coefficients of the polynomials in $A$, represents $G$ over $GF(p)$. To complete the proof, we claim
that $\mathrm{rank}_p(B) \le kr$. This follows from the simple fact that, if $\{u_1, \ldots, u_r\}$ is a basis for the rows
of $A$ over $GF(p^k)$, then the set $\bigcup_{i=1}^{r} \{u_i, x \cdot u_i, \ldots, x^{k-1} \cdot u_i\}$ spans the rows of $A$ when viewed as
$kn$-dimensional vectors over $GF(p)$. ■

This concludes the proof of Proposition 2.2. ■

Remark 2.7: Alon's Ramsey construction [1] is the graph on the vertex set $V = \binom{[r]}{s}$, where
$r = p^3$ and $s = pq - 1$ for some large primes $p \sim q$, and two distinct vertices $A, B$ are adjacent iff
$|A \cap B| \equiv -1 \pmod p$. Our construction allows $p$ and $q$ to be large prime-powers $p^k \sim q^l$. Note
that the original construction by Frankl and Wilson [7] had the parameters $r = q^3$ and $s = q^2 - 1$
for some prime-power $q$, and two distinct vertices $A$ and $B$ are adjacent iff $|A \cap B| \equiv -1 \pmod q$.

Remark 2.8: Another corollary of Proposition 2.2 is that the ratio between $\mathrm{minrk}_{\mathbb{F}}(G)$ and $\vartheta(G)$
can be arbitrarily large. To see this, consider the $n$-vertex graph $G$ constructed in Proposition 2.2
for $\mathbb{F} = GF(p)$ and $\mathbb{K} = GF(q)$, where $p$ and $q$ are two distinct primes: it satisfies $\mathrm{minrk}_p(G) \le n^{o(1)}$
and $\mathrm{minrk}_q(\overline{G}) \le n^{o(1)}$. Clearly, $G$ is vertex transitive (that is, its automorphism group is closed
under all vertex substitutions), as we can always relabel the elements of the ground set $[r]$. By [10]
(Theorem 9), every vertex transitive graph $G$ on $n$ vertices satisfies

$$\vartheta(G)\, \vartheta(\overline{G}) = n .$$

Assume without loss of generality that $\vartheta(G) \ge \sqrt{n} \ge \vartheta(\overline{G})$ (otherwise, switch the roles of $p$ and $q$
and of $G$ and $\overline{G}$). As $\mathrm{minrk}_p(G) \le n^{o(1)}$ and $\mathrm{minrk}_p(\overline{G}) \ge n^{1-o(1)}$ (by Claim 2.5), deduce that

$$\vartheta(G) \ge n^{\frac{1}{2} - o(1)} \cdot \mathrm{minrk}_p(G) ,$$

and yet

$$\mathrm{minrk}_p(\overline{G}) \ge n^{\frac{1}{2} - o(1)} \cdot \vartheta(\overline{G}) .$$

3 Outperforming linear index codes over multiple fields 

In this section we use variants of the graphs constructed in Proposition 2.2 in order to prove
Theorem 1.2 and Proposition 1.3.

3.1 Proof of Theorem 1.2



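Let $\varepsilon > 0$, and let $G$ be the graph constructed by Proposition 2.2 for $\mathbb{F} = GF(2)$, $\mathbb{K} = GF(3)$, and
a sufficiently large $n$ such that $\mathrm{minrk}_2(G) \le n^{\varepsilon/2}$ and $\mathrm{minrk}_3(\overline{G}) \le n^{\varepsilon/2}$. Let $H$ denote the graph
$G + \overline{G}$, that is, the disjoint union of $G$ and its complement. We claim that

$$\ell(H) \le 3 n^{\varepsilon/2} , \quad \text{and yet} \quad c(H) \ge \sqrt{2n} = \sqrt{|V(H)|} .$$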

To see this, observe that in order to obtain an index code for a given graph, one may always 
arbitrarily partition the graph into subgraphs and concatenate their individual index codes: 

Observation 3.1. For any graph $G$ and any partition of $G$ to subgraphs $G_1, \ldots, G_r$ (that is, $G_i$ is
an induced subgraph of $G$ on some $V_i$, and $V = \bigcup_i V_i$), we have $\ell(G) \le \sum_{i=1}^{r} \ell(G_i)$.

In particular, in our case, by combining the above with Proposition 2.1 we have

$$\ell(H) \le \ell(G) + \ell(\overline{G}) \le n^{\varepsilon/2} + \lceil \log_2 3 \cdot n^{\varepsilon/2} \rceil \le 3 n^{\varepsilon/2} .$$

Finally, label the vertices of $G$ as $\{v_1, \ldots, v_n\}$ and the corresponding vertices of $\overline{G}$ as $\{v'_1, \ldots, v'_n\}$.
Following the arguments of the proof of Claim 2.5, it is easy to verify that the set of vertices
$\{(v_i, v'_i) : i \in [n]\} \cup \{(v'_i, v_i) : i \in [n]\}$ is an independent set of size $2n$ in $G \times \overline{G} + \overline{G} \times G$, which is
an induced subgraph of $H \times H$. Therefore, $c(H) \ge \sqrt{2n}$. ■

Remark 3.2: A standard argument gives a slight improvement in the above lower bound on
$c(H)$, to $c(H) \ge 2\sqrt{n}$. See, e.g., [1] (proof of Theorem 2.1) for further details.

3.2 Proof of Proposition 1.3

Notice that $\mathrm{minrk}_{p^e}(G) \le \mathrm{minrk}_{p^d}(G)$ for any prime $p$ and integers $e > d$. Therefore, we can
assume without loss of generality that all the $\mathbb{F}_i$-s are fields with pairwise distinct characteristics.
Let $G_i$ denote the graph obtained by applying Proposition 2.2 on $\mathbb{K}$ and $\mathbb{F}_i$, so that:

$$\mathrm{minrk}_{\mathbb{K}}(G_i) \le n^{\varepsilon/2} \quad \text{and} \quad \mathrm{minrk}_{\mathbb{F}_i}(G_i) \ge n^{1 - \varepsilon/2} ,$$

and let $G = \sum_{i=1}^{t} G_i$ be the disjoint union of these graphs. Since the adjacency matrix of $G$ is a
diagonal block-matrix of the adjacency matrices corresponding to the individual $G_i$-s, we obtain
that

$$\mathrm{minrk}_{\mathbb{K}}(G) = \sum_{i=1}^{t} \mathrm{minrk}_{\mathbb{K}}(G_i) \le t\, n^{\varepsilon/2} \le n^{\varepsilon} .$$

Clearly, for every $i$, $\mathrm{minrk}_{\mathbb{F}_i}(G) \ge \mathrm{minrk}_{\mathbb{F}_i}(G_i)$, completing the proof. ■

4 The problem definition revisited 

Call the problem of finding the optimal index code, as defined in Definition 1, Problem 1. At first
glance, Problem 1 seems to capture only very restricted instances of the source coding problem for
ISCOD, and its motivating applications in communication. Namely, the main restrictions are:

(1) Each receiver requests exactly one data block.

(2) Each data block is requested only once.

(3) Every data block consists of a single bit.

In [5], where Definition 1 was stated, it is proved that the source coding problem for ISCOD can
be reduced to a similar one which satisfies restriction (1). This is achieved by replacing a user
that requests $k > 1$ blocks by $k$ users, all having the same side information, and each requesting a
different block. On the other hand, restriction (2) appeared in [5] to simplify the problem and to
enable the side-information to be modeled by a directed graph.^3 Restriction (3) is stated assuming
a larger block size does not dramatically affect the nature of the problem. In what follows, we aim
to reconsider the last two restrictions.

4.1 Larger alphabet and multiple rounds 

Suppose the data string x is over a possibly larger alphabet, e.g., {0, 1}' for some t > 1: 

Problem 2: The generalization of Problem [IJ where each input symbol Xi G {0, 1}' comprises a 
block of t bits. Every user is interested in a single block, and knows a subset of the other blocks. 

By considering each of the t bits of the symbol as one independent round of transmission, one 
can verify that the following formulation is equivalent: 

Problem [Jf: The generalization of Problem [1] to t > 1 rounds over the same side information 
graph G. The sender wishes to transmit t words x , . . . , x t G {0, 1}™, with the same side information 
setting. Receiver i?j is always interested in the i-th bit of the input words, xj, . . . ,x\. 

The above problem can be reduced to Problem 1 by considering the graph $G[t]$, defined as
follows. For some integer $t$, let $G[t]$ denote the $t$-blow-up of $G$ (with independent sets), that is,
the graph on the vertex set $V(G) \times [t]$, where $(u,i)$ and $(v,j)$ are adjacent iff $uv \in E(G)$. Indeed,
Problem 2 reduces to Problem 1 with the side information graph $G[t]$, by assigning a receiver to
each of the data bits. Therefore, this extension is in fact a special case of the original seemingly
restricted problem.
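The blow-up can be sketched in a few lines of Python. This is only an illustration: the function name `blow_up` and the encoding of $G$ as a set of ordered pairs over vertices $0, \ldots, n-1$ are assumptions of this sketch, not notation from the paper.

```python
from itertools import product


def blow_up(n, edges, t):
    """Return (vertices, edges) of G[t] for a directed graph G on vertices 0..n-1.

    Vertex (v, i) stands for the i-th bit of the block requested by receiver v.
    (u, i) -> (v, j) is an edge iff u -> v is an edge of G, so the t copies of a
    single vertex form an independent set.
    """
    vertices = [(v, i) for v in range(n) for i in range(t)]
    blown = {
        ((u, i), (v, j))
        for (u, v) in edges
        for i, j in product(range(t), repeat=2)
    }
    return vertices, blown


# Directed 3-cycle: receiver 0 knows bit 1, receiver 1 knows bit 2, receiver 2 knows bit 0.
vertices, blown = blow_up(3, {(0, 1), (1, 2), (2, 0)}, t=2)
```

Each edge of $G$ gives rise to $t^2$ edges of $G[t]$, which reflects that a receiver who knows a block knows all of its $t$ bits.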

Clearly, one may choose to treat each round of transmission independently, at a total cost of
$t \cdot \ell(G)$ transmitted bits, thus $\ell(G[t]) \leq t \cdot \ell(G)$. The next remark shows that this bound is sometimes
tight:

Remark 4.1: If an undirected graph $G$ satisfies $\ell(G) = \alpha(G)$ (this holds, e.g., for all graphs
satisfying $\alpha(G) = \chi(\overline{G})$, and namely for perfect graphs), then $\ell(G[t]) = t \cdot \ell(G)$, as

$$t \cdot \ell(G) \geq \ell(G[t]) \geq \alpha(G[t]) = t \cdot \alpha(G) = t \cdot \ell(G).$$
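The identity $\alpha(G[t]) = t \cdot \alpha(G)$ used in Remark 4.1 can be checked by exhaustive search on a small case. The sketch below assumes the 4-cycle as $G$ (a bipartite, hence perfect, graph) and $t = 3$; all function names are illustrative.

```python
from itertools import combinations


def independence_number(vertices, adjacent):
    """Size of a largest subset with no adjacent pair, by exhaustive search."""
    best = 0
    for r in range(1, len(vertices) + 1):
        found = any(
            all(not adjacent(a, b) for a, b in combinations(subset, 2))
            for subset in combinations(vertices, r)
        )
        if not found:
            break  # independent sets are closed under taking subsets
        best = r
    return best


def cycle_adjacent(u, v, n=4):
    """Adjacency in the undirected cycle on vertices 0..n-1."""
    return (u - v) % n in (1, n - 1)


def blow_up_adjacent(a, b):
    """(u, i) and (v, j) are adjacent in G[t] iff u and v are adjacent in G."""
    return cycle_adjacent(a[0], b[0])


t = 3
base = list(range(4))
blown = [(v, i) for v in base for i in range(t)]
alpha_g = independence_number(base, cycle_adjacent)
alpha_gt = independence_number(blown, blow_up_adjacent)
```

The search is exponential in the number of vertices, so it is only meant for toy graphs.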

However, as the next remark states, one may indeed save on communication when sending a 
unified transmission for the entire set of rounds (or block of symbols): 

3 It followed the observation that if the same block is requested by several receivers, then most of the communication 
saving comes from transmitting this block once (duplicate elimination). 
Remark 4.2: In a subsequent work [2], we show that there are graphs for which $\ell(G[t]) < t \cdot \ell(G)$.
That is, transmission of $t$ rounds may strictly improve upon the performance of $t$ independent
transmissions. This justifies the study of the index coding rate defined by

$$\lim_{t \to \infty} \frac{\ell(G[t])}{t}$$

(the limit exists by sub-additivity). This corresponds to the average length of a codeword per
round, when the number of rounds tends to infinity.
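The sub-additivity argument can be spelled out as follows. This is a sketch that appeals to Fekete's lemma, which the text does not name explicitly: concatenating an optimal code for $s$ rounds with an optimal code for $t$ rounds gives a valid code for $s + t$ rounds, so

```latex
\ell(G[s+t]) \le \ell(G[s]) + \ell(G[t]) \quad \text{for all } s, t \ge 1,
\qquad \text{hence} \qquad
\lim_{t \to \infty} \frac{\ell(G[t])}{t} \;=\; \inf_{t \ge 1} \frac{\ell(G[t])}{t}.
```

In particular, the rate is at most $\ell(G[t])/t$ for every fixed $t$.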

A natural extension of Problem 2' is the case where the underlying side information graph
changes between rounds:

Problem 3: The generalization of Problem 1 to $t > 1$ rounds: the sender wishes to transmit
$t$ words $x^1, \ldots, x^t \in \{0,1\}^n$, with respective side information graphs $G_1, \ldots, G_t$. Receiver $R_i$ is
always interested in the $i$-th bit of the input words, $x^1_i, \ldots, x^t_i$.

Even in this more general setting, a reduction to Problem 1 is possible: let $G = G_1 \circ \cdots \circ G_t$
denote the directed graph on the vertex set $V(G) = [n] \times [t]$, where for all $i_1, i_2 \in [n]$ and $k_1, k_2 \in [t]$,
$((i_1,k_1), (i_2,k_2))$ is an edge of $G$ iff $(i_1,i_2) \in E(G_{k_2})$. Again, it is straightforward to see that $\ell(G)$
is precisely the solution for Problem 3.
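The composed graph can be built directly from this definition. In the sketch below, the name `compose`, the 0-based indexing and the edge-set encoding are assumptions made for illustration only.

```python
def compose(edge_sets):
    """Edge set of G_1 o ... o G_t, where edge_sets[k] is the edge set of round k.

    Vertex (i, k) is receiver i in its role of requesting bit i of round k.
    ((i1, k1), (i2, k2)) is an edge iff (i1, i2) is an edge of the graph of
    round k2: receiver i1 knows bit i2 of round k2 regardless of k1.
    """
    t = len(edge_sets)
    return {
        ((i1, k1), (i2, k2))
        for k2 in range(t)
        for (i1, i2) in edge_sets[k2]
        for k1 in range(t)
    }


# Two receivers, two rounds: receiver 0 knows bit 1 in round 0,
# receiver 1 knows bit 0 in round 1.
edges = compose([{(0, 1)}, {(1, 0)}])
```

When all rounds share the same graph $G$, this construction coincides with the blow-up $G[t]$.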

In the general setting of Problem 3, it is even simpler to see that independent transmissions may
consume significantly more communication. For instance, consider the following case. We have two
receivers, $R_1$ and $R_2$, and two rounds for transmitting the binary words $x = x_1 x_2$ and $y = y_1 y_2$.
Suppose that in the first round receiver $R_1$ knows $x_2$ and in the second transmission receiver $R_2$
knows $y_1$. In this case, each round - if transmitted separately - requires 2 bits to be transmitted.
Yet, if the server transmits the 3 bits

$$x_1 \oplus y_1, \quad x_2 \oplus y_2, \quad x_1 \oplus y_2,$$

then both receivers can reconstruct their missing bits (and moreover, reconstruct all of x and y).
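The two-receiver example is small enough to verify exhaustively. The following sketch checks, over all 16 inputs, that the three transmitted bits together with a receiver's side information determine every bit of $x$ and $y$. The function names and the index convention $(x_1, x_2, y_1, y_2) \mapsto (0, 1, 2, 3)$ are assumptions of this sketch.

```python
from itertools import product


def encode(x1, x2, y1, y2):
    """The three broadcast bits of the example."""
    return (x1 ^ y1, x2 ^ y2, x1 ^ y2)


def decodable(side_info, wanted):
    """True iff (codeword, side information) always determines the wanted bits.

    side_info and wanted are tuples of indices into (x1, x2, y1, y2).
    """
    seen = {}
    for bits in product((0, 1), repeat=4):
        key = (encode(*bits), tuple(bits[i] for i in side_info))
        value = tuple(bits[i] for i in wanted)
        if seen.setdefault(key, value) != value:
            return False  # two inputs look identical to the receiver
    return True


r1_ok = decodable(side_info=(1,), wanted=(0, 1, 2, 3))  # R1 knows x2
r2_ok = decodable(side_info=(2,), wanted=(0, 1, 2, 3))  # R2 knows y1
no_side_info_ok = decodable(side_info=(), wanted=(0,))
```

A receiver without any side information cannot recover even $x_1$, since flipping all four input bits leaves the codeword unchanged.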

This in fact is a special case of the following construction. We define a pair of graphs $G_1, G_2$ such
that $\ell(G_1) = \ell(G_2) = n$ and yet only $\ell(G_1 \circ G_2) = n + 1$ bits need to be transmitted for consecutive
transmissions. This is stated in the next claim, where the transitive tournament graph on $n$ vertices
is isomorphic to the directed graph on the vertex set $[n]$, where $(i,j)$ is an edge iff $i < j$.

Claim 4.3. Let $G_1$ denote the transitive tournament graph on $n$ vertices, and let $G_2$ denote the
graph obtained from $G_1$ by reversing all edges. Then $\ell(G_1) + \ell(G_2) = 2n$, and yet $\ell(G_1 \circ G_2) = n + 1$.

Proof. Without loss of generality, assume that $E(G_1) = \{(i,j) : i < j\}$ and $E(G_2) = \{(i,j) : i > j\}$.
Since $G_1$ and $G_2$ are both acyclic, the fact that $\ell(G_1) = \ell(G_2) = n$ follows from the lower bound
of [1] ($\ell(G)$ is always at least the size of a maximum induced acyclic subgraph of $G$).

Recall that by definition, $G_1 \circ G_2$ is the disjoint union of $G_1$ and $G_2$, with the additional edges
$\{((i,1), (j,2)) : j < i\}$ and $\{((i,2), (j,1)) : j > i\}$. Therefore, $G_1 \circ G_2$ has an induced acyclic graph
of size $n + 1$: for instance, the set $\{(i,1) : i \in [n]\} \cup \{(n,2)\}$ induces such a graph. We deduce that
$\ell(G_1 \circ G_2) \geq n + 1$.
To complete the proof of the claim, we give an encoding scheme for $G_1 \circ G_2$ which requires the
transmission of $n + 1$ bits, hence $\ell(G_1 \circ G_2) \leq n + 1$. Denote the two words to be transmitted by
$x = x_1 \ldots x_n$ and $y = y_1 \ldots y_n$. The coding scheme is linear: by transmitting $x_i \oplus y_i$ for $i \in [n]$ and
$\bigoplus_{i \in [n]} x_i$, it is not difficult to see that each receiver is able to decode its missing bits (in fact, each
receiver can reconstruct all the bits of $x$ and $y$). ■
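The decoding step left to the reader can be made explicit. In the sketch below (0-based indices; the function names are illustrative), receiver $i$ knows $x_j$ for $j > i$ and $y_j$ for $j < i$, and the last transmitted bit is taken to be the parity of $x$. The scheme is then checked exhaustively for small $n$.

```python
from functools import reduce
from itertools import product
from operator import xor


def encode(x, y):
    """n + 1 bits: x_j XOR y_j for every j, followed by the parity of x."""
    return [a ^ b for a, b in zip(x, y)] + [reduce(xor, x)]


def decode(i, code, known_x, known_y):
    """Recover all of x and y for receiver i.

    known_x maps j > i to x_j and known_y maps j < i to y_j.
    """
    n = len(code) - 1
    x, y = dict(known_x), dict(known_y)
    for j in range(n):
        if j > i:
            y[j] = code[j] ^ x[j]
        elif j < i:
            x[j] = code[j] ^ y[j]
    # Every x_j with j != i is now known, so the parity bit yields x_i.
    x[i] = code[n] ^ reduce(xor, (x[j] for j in range(n) if j != i), 0)
    y[i] = code[i] ^ x[i]
    return [x[j] for j in range(n)], [y[j] for j in range(n)]


def scheme_works(n):
    """Check every input pair and every receiver for word length n."""
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            code = encode(x, y)
            for i in range(n):
                known_x = {j: x[j] for j in range(i + 1, n)}
                known_y = {j: y[j] for j in range(i)}
                if decode(i, code, known_x, known_y) != (list(x), list(y)):
                    return False
    return True


all_ok = all(scheme_works(n) for n in range(1, 5))
```

Such a check does not replace the proof, but it confirms the decoding order: first the bits obtained from $x_j \oplus y_j$ for $j \neq i$, then $x_i$ from the parity bit, and finally $y_i$.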



4.2 Shared requests 

Problem 4: The generalization of Problem 1 to $m \geq n$ receivers, each interested in a single bit
(i.e., we allow several users to ask for the same bit).

In this case, the one-to-one correspondence between message bits and receivers no longer holds, 
thus the directed side information graph seems unsuitable. However, it is still possible to obtain 
bounds on the optimal linear and non-linear codes using slightly different models. 

Let $\mathcal{P}_4$ denote an instance of Problem 4 and let $\ell(\mathcal{P}_4)$ denote the length of an optimal index
code in this setting. It is convenient to model the side-information of $\mathcal{P}_4$ using a binary $m \times n$
matrix, where the $ij$ entry is 1 iff the $i$-th user knows the $j$-th bit (if $m = n$, this matrix is the
adjacency matrix of the side information graph). With this in mind, we extend the notion of
representing the side-information graph as follows: an $m \times n$ matrix $B$ represents $\mathcal{P}_4$ over $\mathbb{F}$ iff for
all $i$ and $j$:

• If the $i$-th receiver is interested in the bit $x_j$, then $B_{ij} \neq 0$.

• If the $i$-th receiver is neither interested in nor knows the bit $x_j$, then $B_{ij} = 0$.

Notice that in the special case $m = n$, the above definition coincides with the usual definition of
representing the side-information graph. Let $\mathrm{minrk}_{\mathbb{F}}(\mathcal{P}_4)$ denote the minimum rank of a matrix $B$
that represents $\mathcal{P}_4$ over $\mathbb{F}$. It is straightforward to verify that results analogous to Proposition 2.1
and Remark 2.3 hold for this extended notion of matrix representation:

Theorem 4.4. Let $\mathcal{P}_4$ denote an instance of Problem 4. Then the length of an optimal linear code
is $\mathrm{minrk}_2(\mathcal{P}_4)$, and the upper bounds of Theorem 2.1 on arbitrary index codes hold for $\ell(\mathcal{P}_4)$ as well.
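For tiny instances of Problem 4, the minimum rank over GF(2) can be found by brute force over all representing matrices. The sketch below is exponential in the amount of side information and is meant only as an illustration. The encoding of an instance as a list `wants` (the bit each receiver requests) and a list `knows` (the set of bits each receiver holds) is an assumption of this sketch.

```python
from itertools import product


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    packed = [int("".join(map(str, row)), 2) for row in rows]
    rank = 0
    while packed:
        pivot = packed.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            packed = [r ^ pivot if r & low else r for r in packed]
    return rank


def minrk2(n, wants, knows):
    """Minimum GF(2) rank of an m x n matrix B representing the instance.

    B[i][wants[i]] must be 1, B[i][j] is free when receiver i knows bit j,
    and every other entry must be 0.
    """
    m = len(wants)
    free = [(i, j) for i in range(m) for j in sorted(knows[i]) if j != wants[i]]
    best = None
    for choice in product((0, 1), repeat=len(free)):
        B = [[1 if j == wants[i] else 0 for j in range(n)] for i in range(m)]
        for (i, j), bit in zip(free, choice):
            B[i][j] = bit
        rank = gf2_rank(B)
        best = rank if best is None else min(best, rank)
    return best


# Two bits, three receivers; receivers 0 and 2 both request bit 0.
shared = minrk2(2, wants=[0, 1, 0], knows=[{1}, {0}, {1}])
weaker = minrk2(2, wants=[0, 1, 0], knows=[{1}, {0}, set()])
```

In the first instance a single transmitted bit, the XOR of the two message bits, serves all three receivers. In the second, the third receiver has no side information, so no representing matrix has rank below 2.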

Next, given $\mathcal{P}_4$, define the following two directed $m$-vertex graphs $G_{\mathrm{ind}}$ and $G_{\mathrm{cl}}$. Both vertex sets
correspond to the $m$ users, where each set of users interested in the same bit forms an independent
set in $G_{\mathrm{ind}}$ and a clique in $G_{\mathrm{cl}}$. In the remaining cases, in both graphs $(v_i, v_j)$ is an edge iff the
$i$-th user knows the bit in which the $j$-th user is interested (for $m = n$, both graphs are equal
to the usual side-information graph defined in Definition 2). The following simple claim provides
additional bounds on $\ell(\mathcal{P}_4)$; we omit the details of its proof.

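Claim 4.5. If $\mathcal{P}_4$ denotes an instance of Problem 4, and $G_{\mathrm{ind}}$ and $G_{\mathrm{cl}}$ are defined as above, then:

1. $\ell(G_{\mathrm{cl}}) \leq \ell(\mathcal{P}_4)$, and in addition, $\mathrm{minrk}_{\mathbb{F}}(G_{\mathrm{cl}}) \leq \mathrm{minrk}_{\mathbb{F}}(\mathcal{P}_4)$ for all $\mathbb{F}$.

2. $\ell(\mathcal{P}_4) \leq \ell(G_{\mathrm{ind}})$, and in addition, $\mathrm{minrk}_{\mathbb{F}}(\mathcal{P}_4) \leq \mathrm{minrk}_{\mathbb{F}}(G_{\mathrm{ind}})$ for all $\mathbb{F}$.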
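The two auxiliary graphs of Claim 4.5 can be constructed mechanically from an instance. The sketch below uses a hypothetical encoding of the instance as lists `wants` and `knows` (the requested bit and the set of known bits of each user); the function name is likewise illustrative.

```python
def request_graphs(wants, knows):
    """Return the edge sets of the two auxiliary graphs on users 0..m-1.

    Users requesting the same bit form an independent set in the first graph
    and a clique in the second. Otherwise (i, j) is an edge of both graphs
    iff user i knows the bit requested by user j.
    """
    m = len(wants)
    g_ind, g_cl = set(), set()
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            if wants[i] == wants[j]:
                g_cl.add((i, j))
            elif wants[j] in knows[i]:
                g_ind.add((i, j))
                g_cl.add((i, j))
    return g_ind, g_cl


# Users 0 and 2 both request bit 0; user 1 requests bit 1.
g_ind, g_cl = request_graphs(wants=[0, 1, 0], knows=[{1}, {0}, {1}])
```

The first edge set is always contained in the second, which matches the direction of the two bounds in the claim: more side information can only shorten the code.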






5 Concluding remarks and open problems 



• In this paper we have introduced constructions of graphs for which linear index coding is
suboptimal (Theorem 1.1), thus disproving the main conjecture of [3]. It is in fact shown that
any linear index code for these $n$-vertex graphs requires a word length of $n^{1-o(1)}$ bits (barely
improving the naive protocol which requires $n$ bits), yet a given index code for these graphs
utilizes words which are only $n^{o(1)}$ bits long.

• The graphs constructed extend Alon's variant of the Ramsey construction given by Frankl
and Wilson. For these graphs, linear index codes over higher-order fields outperform the
linear codes over $GF(2)$. Furthermore, a variant of this construction (given in Theorem 1.2)
shows that there are graphs where linear codes over any field are suboptimal.

• The main question for further work is trying to obtain tight bounds on $\ell(G)$ for a general
graph $G$. In addition, it would be interesting to determine the expected value of $\ell(G)$ for the
random graph $G \sim G(n, \frac{1}{2})$.

• In Theorem 1.1 we have constructed $n$-vertex graphs, where the ratio between the parameters
$\mathrm{minrk}_2(G)$ and $\ell(G)$ was $n / \exp(O(\sqrt{\log n \log\log n}))$. It can be interesting to obtain an even
larger gap between these two parameters, and namely, to show $n$-vertex graphs $G$ where
$\mathrm{minrk}_2(G)/\ell(G) = \Omega(n)$. This may require either a different approach to the problem, or
significantly improving the given Ramsey constructions.

• In addition, we showed that more general scenarios of index coding, as presented in [5], can 
be reduced to the main problem, which recently attracted attention. In this context, we have 
demonstrated that one may save on communication when transmitting t binary words at 
once, rather than transmitting these words independently. We have shown this for the case 
where the underlying side-information graph is allowed to change dynamically. 

• The most interesting scenario is that of large data blocks over a fixed side-information graph.
As in [3], we have confined ourselves in this paper mainly to the case in which each of the
data blocks consists of a single bit. However, this analysis of index coding is relevant to
the motivating application only if the communication which is required to coordinate the
side information graph is negligible with respect to the size of the data blocks themselves.
Therefore, we should, in fact, consider a scenario in which an $n$-word of $b$-bit blocks is
transmitted, where $b \gg n$. In this case, it is clearly possible to use an optimal index code for
each bit in the block independently, transmitting $b \cdot \ell(G)$ bits altogether. Nevertheless, this
protocol is not guaranteed to be optimal, which yields the following natural question:

Is there a side information graph $G$ on $n$ vertices and integer $b$, for which transmitting an
$n$-word which consists of $b$-bit blocks requires less than $b \cdot \ell(G)$ bits?

Remark 5.1: After the completion of this work, with Noga Alon, we were able to answer the 
last question in the affirmative. This will appear in a subsequent work [2]. 

Acknowledgement: We are grateful to Noga Alon and Oded Regev for helpful discussions. We 
would also like to thank the FOCS 2007 program committee for helpful suggestions. 